Scientific Report I – Scientific Activity during Your Fellowship II – Publication(s) during Your Fellowship

Author

Raimo Launonen

Abstract

Cluster analysis of large datasets is an interesting challenge in many fields of science and engineering. One important clustering approach is hierarchical clustering, which outputs hierarchical (nested) structures of a given dataset. Single-link is a distance-based hierarchical clustering method that can find non-convex (arbitrarily shaped) clusters in a dataset. However, it cannot be used to cluster large datasets, because it either keeps the entire dataset in main memory or scans the dataset multiple times from secondary memory. Both are potentially severe problems for cluster analysis in large datasets. One remedy for both problems is to create a summary of the dataset efficiently and then use that summary to speed up the clustering method. In this paper, we propose a summarization scheme termed data sphere (ds) to speed up the single-link clustering method on large datasets. The ds uses the sequential leaders clustering method to collect important statistics of a given dataset. The single-link method is modified to work with ds; the modified method is termed summarized single-link (SSL). The SSL method is considerably faster than the single-link method applied directly to the dataset, and its clustering results are close to those produced by the single-link method. The SSL method outperforms single-link with data bubble (a summarization scheme) in both clustering accuracy and computation time. To speed up the proposed summarization scheme, a technique is introduced to reduce the large number of distance computations in the leaders method. Experimental studies demonstrate the effectiveness of the proposed summarization scheme for large datasets.

3. Bidyut Kr. Patra, Raimo Launonen, Ollikainen Ville, Sukumar Nandi. Exploiting Bhattacharyya similarity measure to diminish user cold-start problem in sparse data. The 17th International Conference on Discovery Science (DS-2014), University of Ljubljana, Ljubljana, Slovenia, October 2014. (Accepted)

Abstract: Collaborative filtering (CF) is one of the most successful approaches for personalized product recommendation. Neighborhood-based collaborative filtering is an important class of CF; it is a simple and efficient approach to product recommendation that is widely used in the commercial domain. However, neighborhood-based CF suffers from the user cold-start problem, which becomes severe when it is applied to sparse rating data. In this paper, we propose an effective similarity measure to address the user cold-start problem in sparse rating datasets. Unlike existing measures, the proposed measure can find neighbors in the absence of co-rated items. To show its effectiveness under cold-start scenarios, we experimented with real rating datasets. Experimental results show that CF based on our measure outperforms CF based on state-of-the-art measures on the cold-start problem.

4. Bidyut Kr. Patra, Ollikainen Ville, Raimo Launonen, Sukumar Nandi and Korra Sathya Babu. A distance based incremental clustering for mining clusters of arbitrary shapes. The Fifth International Conference on Pattern Recognition and Machine Intelligence (PReMI 2013), Indian Statistical Institute, Kolkata, India, December 2013. (Accepted)

Abstract: Clustering has been recognized as one of the important tasks in data mining. One important class of clustering methods is distance based. To reduce the computational and storage burden of classical clustering methods, many distance-based hybrid clustering methods have been proposed. However, these methods are not suitable for cluster analysis in dynamic environments, where the underlying data distribution and, consequently, the clustering structure change over time. In this paper, we propose a distance-based incremental clustering method that can find arbitrarily shaped clusters in fast-changing dynamic scenarios. Our method is based on the recently proposed al-SL method, which can successfully be applied to large static datasets. In the incremental version of al-SL (termed IncrementalSL), we exploit important characteristics of the al-SL method to handle frequent updates of patterns in the given dataset. The IncrementalSL method can produce exactly the same clustering results as the al-SL method. To show the effectiveness of IncrementalSL on dynamically changing databases, we experimented with one synthetic and one real-world dataset.

5. Apurva Pathak and Bidyut Kr. Patra. A knowledge reuse framework for improving novelty and diversity in recommendation. Communicated to 2nd ACM IKDD Conference on Data Science (CoDS 2014), Bangalore, India, March 2015. (Pending)

III – ATTENDED SEMINARS, WORKSHOPS, CONFERENCES

1. The 17th International Conference on Discovery Science (DS-2014), University of Ljubljana, Ljubljana, Slovenia, October 2014.
2. Mini-workshop on Multiple Dimensions of Relevance, CWI, Netherlands, October 13, 2014.
3. ABCDE Seminar III, Athens, Greece, October 2013.

IV – RESEARCH EXCHANGE PROGRAMME (REP)

Similar resources

I – Scientific Activity during Your Fellowship

Gardner, York and Block

Dear Mr. York: As reader services correspondent at Fellowship headquarters, I would like to thank you for sending us your review article of Martin Gardner's Urantia. Of the half-dozen reviews we've read, yours is certainly the most serious, insightful and heartening. It was also gratifying to learn that you met your first Urantia Book believers during your visit to our information center at the...

The Interview Process for Gastroenterology Fellowship

Gastroenterology has become one of the most competitive fellowships in the field of internal medicine. 1 The application phase usually starts during the second half of the PGY-2 year, and includes writing a personal statement, completing an application form via the electronic residency application service (ERAS), obtaining letters of recommendation, and selecting the programs to apply for. The inte...

Choosing a Post-Fellowship Path

While the gastroenterology career path is by no means short, for many of us it was relatively straightforward. If, like me, you knew you wanted to pursue gastroenterology, you moved from medical school to residency to fellowship relatively seamlessly, without need for significant pause or reflection. Yet during fellowship, there is a looming fork in the road. Mentors ask about your career goals...

NDnano Undergraduate Research Fellowship (NURF) 2013 Project Summary

4) Briefly describe any new skills you acquired during your summer research: i) I gained hands-on experience in the fabrication of nano-scale devices in the cleanroom. ii) I learnt to use software tools such as Mathematica for modeling and L-Edit for device layout. iii) I learnt to use Cascade 11000 Probe Station and IPE Probe Station for the electrical characterization of fabricated t...

The University of Kansas School of Medicine.

Thank you for considering the University of Kansas Medical Center for your residency/fellowship training. Choosing a residency program is one of the most important decisions of your medical career. In order to facilitate your application process at KUMC we recommend you review the following link http://www.kumc.edu/school-of-medicine/gme/prospective-residentsfellows.html which is located on the...


Publication date: 2014
